Noise limits in the assembly of diffraction data 
We obtain an information theoretic criterion for the feasibility of assembling diffraction signals
from noisy tomographs when the positions of the tomographs within the signal are unknown. For 
shot-noise limited data, the minimum number of detected photons per tomograph for successful 
assembly is much smaller than previously believed necessary, growing only logarithmically with the 
number of resolution elements of the diffracting object. We also demonstrate assembly up to the 
information theoretic limit with a constraint-based algorithm. 
Arbitrarily high levels of noise can be tolerated when an unlimited number of measurements are available and can be averaged to obtain the signal. A new challenge is introduced when the signal is interrogated tomographically, that is, by means of multiple sections of the signal. If the position of each tomograph within the signal is unknown, then each measurement must have at least the minimum information required to position the tomograph, for signal averaging to be feasible. We will refer to the signal processing demands posed by this scenario as crypto-tomography.

An instance of crypto-tomography is the assembly of diffraction data in the proposed x-ray free electron laser (XFEL) investigations of single molecules [1]. The goal of these experiments is to obtain the 3D structure of a molecule by algorithmically inverting to direct space the measurement of its continuous diffraction pattern. Each XFEL measurement provides information about one 2D tomograph (an Ewald sphere) of the 3D diffraction pattern. Since the molecular orientation in each measurement is random, and the total number of photons collected small, the data assembly problem will test the noise limits of crypto-tomography. To help assess the feasibility of these proposed experiments, we have investigated crypto-tomography in the case of a weak, shot-noise limited signal. Because our approach is information theoretic [2], the results apply to any algorithm that aims to assemble and average noisy diffraction data. We conclude with some results obtained with an algorithm that is able to assemble data close to the information theoretic limit.

To better understand the theoretical issues, we introduce a minimal, three-parameter model of crypto-tomography. A sample signal is shown in Figure 1 and consists of $N$ one-dimensional diffraction patterns generated by $N$ one-dimensional objects. Each of the latter comprises $M$ independent, complex-valued resolution elements $\Psi_{mn}$. The diffraction signal is given by

$$w_n(\theta) = \left| \sum_{m=-M/2}^{M/2} \Psi_{mn}\, e^{i m \theta} \right|^2, \qquad n = 1, \ldots, N, \tag{1}$$

where $\theta$ is the single angle that specifies the position of the tomograph. For any $\theta$, the numbers $w_n(\theta)$ are the time-integrated photon fluxes recorded at $N$ detector pixels. The third parameter of the model is the mean photon count per pixel, $\mu$. We will be interested in the limit of large $M$ and $N$. In this limit, the mean photon count is related to the statistics of the ensemble of resolution elements:

$$\left\langle |\Psi_{mn}|^2 \right\rangle = \mu / M. \tag{2}$$



A single exercise in crypto-tomography consists first in selecting one $M \times N$ set of resolution elements with the above statistics; this defines the correct diffraction signal. A noisy data set is then generated by repeatedly selecting a random $\theta$ and sampling $N$ photon counts $k_1$, $k_2$, etc. from Poisson distributions with means given by $w_1(\theta)$, $w_2(\theta)$, etc. Every data item thus consists of an $N$-tuple of photon counts. Given an unlimited number of such data, we are interested in determining the feasibility of reconstructing the original signal as a function of the model parameters $M$, $N$, and $\mu$.
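The data-generation procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resolution elements are drawn as circular complex Gaussians with $\langle|\Psi_{mn}|^2\rangle = \mu/M$ (one choice compatible with Wilson statistics), the signal is taken to be the squared modulus of a Fourier sum over $m = -M/2, \ldots, M/2$, $M$ is assumed even, and the function names `make_signal`, `intensities` and `sample_tomographs` are hypothetical.

```python
import numpy as np

def make_signal(M, N, mu, rng):
    """Draw resolution elements Psi[m, n] for m = -M/2..M/2 and n = 1..N.

    Assumption: circular complex Gaussian amplitudes with <|Psi|^2> = mu / M.
    """
    sigma = np.sqrt(mu / (2.0 * M))      # std of the real and imaginary parts
    shape = (M + 1, N)
    return sigma * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))

def intensities(psi, thetas):
    """w_n(theta) = |sum_m Psi[m, n] exp(i m theta)|^2, one row per theta."""
    M = psi.shape[0] - 1
    m = np.arange(-(M // 2), M // 2 + 1)
    phases = np.exp(1j * np.outer(thetas, m))     # shape (n_shots, M + 1)
    return np.abs(phases @ psi) ** 2              # shape (n_shots, N)

def sample_tomographs(psi, n_shots, rng):
    """Return the hidden angles and the N-tuples of Poisson photon counts."""
    thetas = rng.uniform(0.0, 2.0 * np.pi, size=n_shots)
    counts = rng.poisson(intensities(psi, thetas))
    return thetas, counts

rng = np.random.default_rng(0)
psi = make_signal(M=16, N=64, mu=0.32, rng=rng)
thetas, counts = sample_tomographs(psi, n_shots=1000, rng=rng)
print(counts.shape, counts.mean())
```

In a reconstruction experiment only `counts` would be handed to the assembly algorithm; `thetas` is kept here purely for checking. With $M + 1$ terms in the sum, the mean count per pixel is $(M+1)\mu/M$, which approaches $\mu$ for large $M$.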

Huldt et al. [3] studied crypto-tomography from the perspective of classifying the recorded photon counts. For the $N$-tuples of data in the model above, a decision is made whether a pair $\{k_1(\theta_i), \ldots\}$ and $\{k_1(\theta_j), \ldots\}$ originated from different angles, $\theta_i \neq \theta_j$, or nearly the same angle, $\theta_i \approx \theta_j$ (on the scale $2\pi/M$, since $M$ is the highest frequency in the angular variation of the signal). This decision is based on the value of the cross-correlation

$$c(\theta_i, \theta_j) = \sum_{n=1}^{N} k_n(\theta_i)\, k_n(\theta_j), \tag{3}$$

whose expected value distinguishes between counts derived from a common signal ($\theta_i \approx \theta_j$) or two independent signals:

$$c_{ii} = \langle c(\theta_i, \theta_i) \rangle = 2N\mu^2, \qquad c_{ij} = \langle c(\theta_i, \theta_j) \rangle = N\mu^2. \tag{4}$$

FIG. 1: Top: A diffraction signal comprising 64 continuous one-dimensional signals arranged side by side. The tomography angle $\theta$ varies vertically. Bottom: Reconstruction of the top signal from $10^4$ tomographic measurements (horizontal sections) of unknown $\theta$. Each measurement was a 64-tuple of photon counts with mean count $\mu = 0.32$.

The averages are with respect to Poisson distributions of photon counts with mean values given by diffraction signals $w$ with Wilson statistics, $p(w) = e^{-w/\mu}/\mu$. A pair of $N$-tuples from different angles may be misidentified as originating from the same angle because the cross-correlation itself fluctuates. Evaluating the standard deviation $\sigma_{ij}$ of $c(\theta_i, \theta_j)$ one finds [3]

$$\sigma_{ij}^2 = N(\mu^2 + 4\mu^3 + 3\mu^4). \tag{5}$$

To avoid classification errors we must have $c_{ii} - c_{ij} \gg \sigma_{ij}$, which reduces to the statement

$$N\mu \gg \sqrt{N} \tag{6}$$

in the limit of small $\mu$. This criterion, however, cannot be the fundamental limit since it makes no reference to $M$. As shown below, there is sufficient information for assembly with the much smaller number of detected photons given by (32). Finally, we demonstrate an assembly algorithm, not based on classification, that succeeds in this regime of high noise.
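The cross-correlation statistics used in the classification argument can be checked with a short Monte Carlo simulation. The sketch below is not the procedure of Ref. [3]; it simply assumes that each pixel intensity is an independent exponential (Wilson) variable with mean $\mu$, draws Poisson counts, and compares the sample mean and standard deviation of $c = \sum_n k_n k'_n$ with $2N\mu^2$, $N\mu^2$ and $\sqrt{N(\mu^2 + 4\mu^3 + 3\mu^4)}$. The function name `cross_correlation_stats` and the trial count are illustrative choices.

```python
import numpy as np

def cross_correlation_stats(N, mu, trials, rng):
    """Monte Carlo mean/std of c = sum_n k_n k'_n for same and different signals."""
    # Wilson statistics: pixel intensities are exponential with mean mu.
    w_a = rng.exponential(mu, size=(trials, N))
    w_b = rng.exponential(mu, size=(trials, N))
    k_a1 = rng.poisson(w_a)      # first measurement of signal a
    k_a2 = rng.poisson(w_a)      # second, independent measurement of signal a
    k_b = rng.poisson(w_b)       # measurement of an independent signal b
    c_same = (k_a1 * k_a2).sum(axis=1)
    c_diff = (k_a1 * k_b).sum(axis=1)
    return c_same.mean(), c_diff.mean(), c_diff.std()

rng = np.random.default_rng(1)
N, mu = 64, 0.32
c_ii, c_ij, sigma_ij = cross_correlation_stats(N, mu, 50_000, rng)
print("c_ii    ", c_ii, "vs", 2 * N * mu**2)
print("c_ij    ", c_ij, "vs", N * mu**2)
print("sigma_ij", sigma_ij, "vs", np.sqrt(N * (mu**2 + 4 * mu**3 + 3 * mu**4)))
```

For $N = 64$ and $\mu = 0.32$ the gap $c_{ii} - c_{ij} = N\mu^2$ is comparable to $\sigma_{ij}$ rather than much larger, which is the regime where pairwise classification is unreliable.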



A criterion for the feasibility of crypto-tomography can be formulated in terms of the information content of an abstract function $F$ that, given a measured set of photon counts $K = \{k_1, \ldots, k_N\}$, checks consistency with the reconstructed signal $W(\theta) = \{w_1(\theta), \ldots, w_N(\theta)\}$. For the reconstruction to be unique, $F$ must be able to map fewer bits of input to a greater number of bits checked in the output. This is a restatement of the fact that $F$ has access to a unique reconstruction, a positive source of information.

The number of bits in the output of $F$ is the mutual information [2] $I(K, W)$ associated with the joint probability distribution of photon counts and signals:

$$p(K, W) = p(W) \int \frac{d\theta}{2\pi}\, p(K|W(\theta)). \tag{7}$$

Here $p(W)$ is the prior distribution of signals, as specified by (1) and the statistics (2), and

$$p(K|W(\theta)) = \prod_{n=1}^{N} \frac{w_n(\theta)^{k_n}}{k_n!}\, e^{-w_n(\theta)} \tag{8}$$

is the Poisson distribution of photon counts at angle $\theta$ for signal $W$. The mutual information $I(K, W)$ gives the number of bits of information about $W$ obtained, on average, from each measurement $K$ at an unknown $\theta$.

Given a model reconstruction $W$, the number of independent consistency checks associated with a measurement $K$ has a size in bits given by the amount by which the entropy of $\theta$ is reduced, on average, once $K$ is known:

$$I(K, \theta) = H(\theta) - H(\theta|K). \tag{9}$$

This too is mutual information, now associated with the joint distribution $p(K, \theta) = p(K|W(\theta))/2\pi$. Since we are interested in the uniqueness question for average case signals, $I(K, \theta)$ should be averaged over signals with the prior distribution $p(W)$.

The crypto-tomography criterion can now be written down explicitly:

$$I(K, W) > \langle I(K, \theta) \rangle_W, \tag{10}$$

where the mutual information expressions

$$I(K, W) = \int dW\, p(W) \sum_K p(K|W) \log \frac{p(K|W)}{p(K)},$$

$$\langle I(K, \theta) \rangle_W = \int dW\, p(W) \sum_K \int \frac{d\theta}{2\pi}\, p(K|W(\theta)) \log \frac{p(K|W(\theta))}{p(K|W)}$$

involve (8) and the marginal distributions $p(K|W) = \int p(K|W(\theta))\, d\theta/2\pi$ and $p(K) = \int p(K|W)\, p(W)\, dW$. What follows is an analysis of what criterion (10) implies about the parameters of our minimal model in the limit of large $M$ and $N$.

The definitions above imply that the sum

$$I(K, W) + \langle I(K, \theta) \rangle_W = I(K, W(\theta)) \tag{11}$$

corresponds to the mutual information associated with the photon counts $K$ and the signal $W(\theta)$ at a known angle $\theta$ of measurement. The joint distribution $p(K, W(\theta)) = p(K|W(\theta))\, p(W(\theta))$ involves the Wilson distribution of signal values:

$$dW\, p(W(\theta)) = \prod_{n=1}^{N} dw_n(\theta)\, \frac{e^{-w_n(\theta)/\mu}}{\mu}. \tag{12}$$

Evaluating the mutual information we find $I(K, W(\theta)) = N I(\mu)$ where

$$I(\mu) = (1 + \mu) \log(1 + \mu) - \gamma\mu - \sum_{k=2}^{\infty} \left( \frac{\mu}{1 + \mu} \right)^{k} \log k. \tag{13}$$

In the limit of interest, $\mu \to 0$, $I(\mu) \sim (1 - \gamma)\mu$, with the result

$$I(K, W(\theta)) \sim (1 - \gamma) N\mu, \tag{14}$$

or about $(1 - \gamma)/\log 2 \approx 0.61$ bits per photon ($\gamma$ is Euler's constant).
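The per-pixel mutual information and its small-$\mu$ limit are easy to evaluate numerically. The sketch below assumes the series form $I(\mu) = (1+\mu)\log(1+\mu) - \gamma\mu - \sum_{k\ge 2} (\mu/(1+\mu))^k \log k$ for a Poisson count whose mean is exponentially distributed, truncates the series once its terms are negligible, and converts nats to bits per detected photon. The function names and the truncation parameters are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def info_per_pixel(mu, kmax=100_000, tol=1e-18):
    """Mutual information I(mu) in nats between a pixel's count and its intensity."""
    r = mu / (1.0 + mu)
    tail = 0.0
    term = r * r                     # r**k, starting at k = 2
    for k in range(2, kmax):
        tail += term * math.log(k)
        term *= r
        if term < tol:               # remaining terms are negligible
            break
    return (1.0 + mu) * math.log1p(mu) - EULER_GAMMA * mu - tail

def bits_per_photon(mu):
    """I(mu) divided by the mean count mu, converted from nats to bits."""
    return info_per_pixel(mu) / (mu * math.log(2.0))

for mu in (1e-6, 0.01, 0.32):
    print(mu, bits_per_photon(mu))
```

As $\mu$ decreases the printed values approach $(1 - \gamma)/\log 2$, while at larger $\mu$ each photon carries somewhat less information.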

We next evaluate $\langle I(K, \theta) \rangle_W$ in the limits of few and many detected photons $N\mu$. Since $N\mu$ is in either case large in the limit of large $N$, what matters is the relationship between $N\mu$ and the other parameter of the model, $M$. The few photon limit therefore corresponds to $N\mu$ fixed with $M \to \infty$. In this limit the complete prior distribution of signals is sampled by the process of sampling a particular, arbitrarily complex signal $W(\theta)$ at different $\theta$. The distribution $p(K|W)$ in the expression for $\langle I(K, \theta) \rangle_W$ thus can be replaced by the distribution $p(K)$ with the result

$$\langle I(K, \theta) \rangle_W \sim I(K, W(\theta)) \qquad (M \to \infty). \tag{15}$$

The limit of many detected photons is an important point of reference, where the photon counts $K$ can be assigned a unique $\theta$ up to a width defined by a Gaussian distribution. Using the symmetry of the mutual information, we can write $I(K, \theta) = I(\theta, K)$ in the form

$$I(K, \theta) = \left\langle \int d\theta\, p(\theta|K) \log 2\pi p(\theta|K) \right\rangle_K. \tag{16}$$

In the limit of large $N\mu$, the distribution of $\theta$ is a Gaussian centered at some $\theta_K$ with standard deviation $\sigma_K$:

$$\log p(\theta|K) \sim -\log \sqrt{2\pi}\,\sigma_K - \frac{(\theta - \theta_K)^2}{2\sigma_K^2}. \tag{17}$$

The resulting mutual information is then given by

$$I(K, \theta) \sim \frac{1}{2} \log \left( \frac{2\pi}{e} \left\langle \sigma_K^{-2} \right\rangle_K \right), \tag{18}$$

where the average over $K$ may be taken inside the logarithm because, as we shall see, $\sigma_K^{-2}$ is the sum of $N$ independent random terms.

FIG. 2: Plot of the mutual information scaling function $f(x)$ and its intersection with a line of slope $(1 - \gamma)/2$.

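Since $2\pi p(\theta|K) = p(K|W(\theta))/p(K|W)$ depends on $\theta$ only through $p(K|W(\theta))$, we have from (8) the equation

$$\log p(\theta|K) = \sum_{n=1}^{N} \left[ k_n \log w_n(\theta) - w_n(\theta) \right] + \text{const.} \tag{19}$$

Using (17), we obtain

$$\left\langle \sigma_K^{-2} \right\rangle_K = \left\langle -\frac{\partial^2}{\partial\theta^2} \log p(\theta|K) \Big|_{\theta_K} \right\rangle_K \tag{20}$$

$$= \left\langle \sum_{n=1}^{N} \frac{w_n'(\theta_K)^2}{w_n(\theta_K)} \right\rangle_K \tag{21}$$

$$= \int \frac{d\theta}{2\pi} \sum_{n=1}^{N} \frac{w_n'(\theta)^2}{w_n(\theta)}, \tag{22}$$

where the last step makes use of the fact that the distribution on $\theta_K$ associated with the distribution of $K$ is the uniform distribution.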

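FIG. 3: Normalized difference map error versus iteration in overconstrained ($\mu = 0.32$) and underconstrained ($\mu = 0.25$, dashed curve) crypto-tomography reconstructions.

The final step in the many photon limit of $\langle I(K, \theta) \rangle_W$ is to average (18) over signals $W$ of the form (1). Since (22) is again a sum of $N$ independent random terms, the average over $W$ may be taken inside the logarithm:

$$\langle I(K, \theta) \rangle_W \sim \frac{1}{2} \log \left( \frac{2\pi}{e}\, N \left\langle \frac{w'(0)^2}{w(0)} \right\rangle_W \right). \tag{23}$$

The remaining average can be expressed in terms of random variables $X$ and $Y$ (for any one pixel, dropping the index $n$),

$$\frac{w'(0)^2}{w(0)} = -\frac{X^*}{X}\, Y^2 - \frac{X}{X^*}\, (Y^*)^2 + 2|Y|^2, \tag{24}$$

where

$$X = \sum_{m=-M/2}^{M/2} \Psi_m = \Psi_0 + \sum_{m=1}^{M/2} \left( \Psi_m + \Psi_{-m} \right), \tag{25}$$

$$Y = \sum_{m=-M/2}^{M/2} m\, \Psi_m = \sum_{m=1}^{M/2} m \left( \Psi_m - \Psi_{-m} \right). \tag{26}$$

Since $\Psi_m$ and $\Psi_{-m}$ are independent, so are $X$ and $Y$. Associated with the arbitrariness of the angle $\theta$ is the uniformity of the phases of the $\Psi_m$; consequently, the phases of $X$ and $Y$ are uniformly distributed and $\langle (Y^*)^2 \rangle_W = \langle Y^2 \rangle_W = 0$. Using (2) for the third term we obtain

$$\left\langle \frac{w'(0)^2}{w(0)} \right\rangle_W = 2 \left\langle |Y|^2 \right\rangle_W \sim \frac{\mu M^2}{6} \tag{27}$$

in the limit of large $M$.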
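The large-$M$ average above lends itself to a quick Monte Carlo check. The sketch below assumes circular complex Gaussian resolution elements with $\langle|\Psi_m|^2\rangle = \mu/M$ for $m = -M/2, \ldots, M/2$, forms $X = \sum_m \Psi_m$ and $Y = \sum_m m\Psi_m$, and uses $w(0) = |X|^2$ and $w'(0) = i(X^*Y - XY^*) = -2\,\mathrm{Im}(X^*Y)$. It compares the sample average of $w'(0)^2/w(0)$ with the finite-$M$ sum $2(\mu/M)\sum_m m^2$ and with the limiting value $\mu M^2/6$. The function name `fisher_term_mc` is hypothetical.

```python
import numpy as np

def fisher_term_mc(M, mu, trials, rng):
    """Monte Carlo estimate of <w'(0)^2 / w(0)> for a single pixel."""
    m = np.arange(-(M // 2), M // 2 + 1)
    sigma = np.sqrt(mu / (2.0 * M))          # std of real and imaginary parts
    psi = sigma * (rng.standard_normal((trials, m.size))
                   + 1j * rng.standard_normal((trials, m.size)))
    X = psi.sum(axis=1)                      # amplitude at theta = 0
    Y = (psi * m).sum(axis=1)                # -i times its theta-derivative
    w = np.abs(X) ** 2
    dw = -2.0 * np.imag(np.conj(X) * Y)      # w'(0) = i (X* Y - X Y*)
    return np.mean(dw ** 2 / w)

rng = np.random.default_rng(2)
M, mu = 64, 0.32
m = np.arange(-(M // 2), M // 2 + 1)
estimate = fisher_term_mc(M, mu, 40_000, rng)
finite_M = 2.0 * (mu / M) * np.sum(m ** 2)   # 2 <|Y|^2> at finite M
large_M = mu * M ** 2 / 6.0                  # limiting form
print(estimate, finite_M, large_M)
```

At $M = 64$ the finite-$M$ sum exceeds $\mu M^2/6$ by a few percent; at $M = 16$ the difference is closer to twenty percent, so the limiting form should be used with some care for small models.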

Combining (23) and (27), we obtain

$$\langle I(K, \theta) \rangle_W \sim \log M + \frac{1}{2} \log \left( \frac{2\pi}{6e}\, N\mu \right) \tag{28}$$

for the many photon limit. We recognize $\log M$ as the scaling of the entropy of the tomography angle measured in speckle units of $2\pi/M$. Both of the limits (15) and (28) can be expressed in terms of a scaled photon count

$$x = N\mu / \log M, \tag{29}$$

and a dimensionless scaling function

$$\langle I(K, \theta) \rangle_W \sim \log M\, f(x), \tag{30}$$

behaving as $f(x) \sim (1 - \gamma)x$ for small $x$ and $f(x) \sim 1$ for large $x$. We have substantiated this claim by evaluating $\langle I(K, \theta) \rangle_W$ numerically for a large range of parameter values and find the results are consistent with a single function $f$ plotted in Figure 2.

Using (11), (14) and (30), the criterion (10) takes the form

$$(1 - \gamma)\, x > 2 f(x), \tag{31}$$

which, as shown in Figure 2, requires $x > 4.5$. We therefore conclude that crypto-tomography for the three-parameter model is feasible only when

$$N\mu > 4.5 \log M. \tag{32}$$

As an alternative to assembling a diffraction signal by first classifying its noisy tomographs, we propose using the tomographs as constraints on a de novo reconstruction. A general constraint satisfaction algorithm may then be able to operate right at the information theoretic limits of feasibility. We now present results obtained with the iterative difference map algorithm [4] that support this claim. A description of the algorithm and more extensive results are given elsewhere.

To test the criterion (10) we generated $10^4$ sets of photon counts from random tomographs taken from an instance of the three-parameter model with $M = 16$ and $N = 64$. Inequality (32) then implies that crypto-tomography is possible only for mean photon counts $\mu > 0.26$. An example of a successful reconstruction, for the case $\mu = 0.32$, is shown in Figure 1. Figure 3 shows the corresponding difference map error [4], a measure of the incompatibility of constraints at each iteration. An error that remains large during the search for the solution of a constraint satisfaction problem is an indication that the problem is overconstrained and that the solution will be unique. Correct reconstructions were obtained for $\mu$ as small as 0.29, at which point the search became difficult and required very many iterations. For $\mu < 0.26$ the behavior of the difference map error changed, decreasing to zero in few iterations (Fig. 3). This is consistent with the problem having become underconstrained, and in fact reconstructions obtained in this regime never agreed with the original diffraction signal. These observations suggest that the two forms of mutual information in the crypto-tomography criterion (10) correspond, respectively, to the numbers of constraints and free variables in a constraint satisfaction problem.

The motivation for this work was prompted by discussions with Abbas Ourmazd. Support was provided by the Department of Energy grant DE-FG02-05ER46198.